Example Report: Distribution of train stops in Germany

This example uses open data from Mobilithek (https://download-data.deutschebahn.com/static/datasets/haltestellen/D_Bahnhof_2020_alle.CSV) to render a map of Germany on which every train stop is marked and colored by its operator.

The question that interests us is: who runs train stops in Germany, and where?

Install dependencies

First, install the required dependencies. SQLAlchemy is pinned to 1.4.46 because the pandas version used here (1.5.3) does not yet work with SQLAlchemy 2.0. nbformat is needed for plotly's "notebook" renderer, which is used for the plot because the output of the other renderers cannot be rendered to HTML.

In [2]:
%pip install pandas
%pip install plotly
%pip install 'SQLAlchemy==1.4.46'
%pip install nbformat
Requirement already satisfied: pandas in /usr/local/lib/python3.11/site-packages (1.5.3)
Requirement already satisfied: python-dateutil>=2.8.1 in /Users/pheltweg/Library/Python/3.11/lib/python/site-packages (from pandas) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.11/site-packages (from pandas) (2022.7.1)
Requirement already satisfied: numpy>=1.21.0 in /usr/local/lib/python3.11/site-packages (from pandas) (1.24.2)
Requirement already satisfied: six>=1.5 in /Users/pheltweg/Library/Python/3.11/lib/python/site-packages (from python-dateutil>=2.8.1->pandas) (1.16.0)

Requirement already satisfied: plotly in /usr/local/lib/python3.11/site-packages (5.13.1)
Requirement already satisfied: tenacity>=6.2.0 in /usr/local/lib/python3.11/site-packages (from plotly) (8.2.2)

Requirement already satisfied: SQLAlchemy==1.4.46 in /usr/local/lib/python3.11/site-packages (1.4.46)
Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.11/site-packages (from SQLAlchemy==1.4.46) (2.0.2)

Requirement already satisfied: nbformat in /usr/local/lib/python3.11/site-packages (5.7.3)
Requirement already satisfied: fastjsonschema in /usr/local/lib/python3.11/site-packages (from nbformat) (2.16.3)
Requirement already satisfied: jsonschema>=2.6 in /usr/local/lib/python3.11/site-packages (from nbformat) (4.17.3)
Requirement already satisfied: jupyter-core in /Users/pheltweg/Library/Python/3.11/lib/python/site-packages (from nbformat) (5.2.0)
Requirement already satisfied: traitlets>=5.1 in /Users/pheltweg/Library/Python/3.11/lib/python/site-packages (from nbformat) (5.9.0)
Requirement already satisfied: attrs>=17.4.0 in /usr/local/lib/python3.11/site-packages (from jsonschema>=2.6->nbformat) (22.2.0)
Requirement already satisfied: pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0 in /usr/local/lib/python3.11/site-packages (from jsonschema>=2.6->nbformat) (0.19.3)
Requirement already satisfied: platformdirs>=2.5 in /Users/pheltweg/Library/Python/3.11/lib/python/site-packages (from jupyter-core->nbformat) (3.0.0)

[notice] A new release of pip available: 22.3.1 -> 23.0.1
[notice] To update, run: python3.11 -m pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
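Since the SQLAlchemy pin matters for the load step, it can be worth confirming which version the kernel actually sees after the install. The following is a minimal sketch using only the standard library; the helper names `major_version` and `sqlalchemy_is_1x` are illustrative and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string: str) -> int:
    """Return the leading (major) component of a dotted version string."""
    return int(version_string.split(".")[0])


def sqlalchemy_is_1x(version_string: str) -> bool:
    """True if the version belongs to the SQLAlchemy 1.x line pinned above."""
    return major_version(version_string) == 1


try:
    installed = version("SQLAlchemy")
    status = "ok" if sqlalchemy_is_1x(installed) else "not the pinned 1.x line"
    print(f"SQLAlchemy {installed}: {status}")
except PackageNotFoundError:
    # The package is missing from the environment this kernel runs in.
    print("SQLAlchemy is not installed in this environment")
```

If the check reports a 2.x version even though the pin was installed, restarting the kernel (as the pip output suggests) is the first thing to try.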

Load data

Create a pandas DataFrame from the `trainstops` table in the local SQLite file `data.sqlite`.

In [5]:
import pandas as pd

df = pd.read_sql_table('trainstops', 'sqlite:///data.sqlite')
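Before loading, it can help to confirm that the SQLite file really contains the expected table and columns. The sketch below uses the standard-library `sqlite3` module; `describe_table` is a hypothetical helper, and the in-memory database with a single made-up row is only a stand-in so the example runs on its own. To inspect the real data, connect to `data.sqlite` instead.

```python
import sqlite3


def describe_table(conn: sqlite3.Connection, table: str) -> tuple[list[str], int]:
    """Return the column names and the row count of `table`.

    The table name is interpolated into the SQL text, so only pass
    trusted names.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    if not columns:
        raise ValueError(f"table {table!r} not found")
    (row_count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
    return columns, row_count


# Stand-in database with one made-up row (NOT real data from the dataset).
# For the real file, use: conn = sqlite3.connect("data.sqlite")
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE trainstops ("EVA_NR" INTEGER, "NAME" TEXT, "Betreiber_Name" TEXT)'
)
conn.execute("INSERT INTO trainstops VALUES (1, 'Stop A', 'Operator X')")

columns, row_count = describe_table(conn, "trainstops")
print(columns, row_count)
conn.close()
```

A missing table raises `ValueError` here, which makes it easier to tell an empty or misplaced database file apart from a problem in the pandas call.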

Who runs train stops in Germany and where?

To answer our initial question, we use plotly to draw a scatter plot of all train stops in the dataset, overlaid on a map from OpenStreetMap.

The train stops are colored by the Betreiber_Name column (the operator name), which lets us see the area each operator serves.

In [6]:
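import plotly.io as pio
import plotly.express as px

# Use the "notebook" renderer so the figure is embedded in the HTML export.
pio.renderers.default = "notebook"

# One marker per train stop, positioned by latitude (Breite) and
# longitude (Laenge) and colored by operator (Betreiber_Name).
fig = px.scatter_mapbox(df,
                        lat="Breite",
                        lon="Laenge",
                        hover_name="NAME",
                        hover_data=["EVA_NR", "DS100", "Betreiber_Name"],
                        color="Betreiber_Name",
                        zoom=5,
                        height=800,
                        width=1200)

# OpenStreetMap tiles as the base map.
fig.update_layout(mapbox_style="open-street-map")
# Remove the margins so the map fills the whole figure.
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()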
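The map answers the "where"; a simple count per operator can complement it for the "who". The sketch below groups the `trainstops` table by `Betreiber_Name` with plain SQL through the standard-library `sqlite3` module. `stops_per_operator` is a hypothetical helper, and the three rows are made-up placeholders, not values from the dataset.

```python
import sqlite3


def stops_per_operator(conn, table="trainstops", column="Betreiber_Name"):
    """Return (operator, number_of_stops) pairs, largest operator first.

    Table and column names are interpolated into the SQL text, so only
    pass trusted names.
    """
    query = (
        f'SELECT "{column}", COUNT(*) AS n FROM "{table}" '
        f'GROUP BY "{column}" ORDER BY n DESC, "{column}"'
    )
    return conn.execute(query).fetchall()


# Stand-in database with made-up rows (NOT real data from the dataset).
# For the real file, use: conn = sqlite3.connect("data.sqlite")
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE trainstops ("NAME" TEXT, "Betreiber_Name" TEXT)')
conn.executemany(
    "INSERT INTO trainstops VALUES (?, ?)",
    [("Stop A", "Operator X"), ("Stop B", "Operator X"), ("Stop C", "Operator Y")],
)

counts = stops_per_operator(conn)
for operator, n in counts:
    print(f"{operator}: {n}")
conn.close()
```

With the DataFrame from the load step, `df["Betreiber_Name"].value_counts()` should give the same kind of summary without a separate query.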